LLVM and SPIRV-LLVM-Translator pulldown (WW15 2026) #21723
…aries (#189044) We only did this for local variables but were missing it for globals.
…ardOperands API to BranchOpInterface (#187864) To simplify the output of the reduction-tree pass, this PR introduces the eraseRedundantBlocksInRegion. For regions containing multiple execution paths, this functionality selects the shortest 'interesting' path. Additionally, this PR adds the getSuccessorForwardOperands API to BranchOpInterface. This allows us to extract the ForwardOperands for a specific path chosen from multiple alternatives, enabling the creation of a cf.br operation for the redirected jump.
…tions (#189113) Fixes llvm/llvm-project#187716.
…ssorForwardOperands API to BranchOpInterface" (#189150) Reverts llvm/llvm-project#187864, because it is causing some buildbot failures. See https://lab.llvm.org/buildbot/#/builders/138/builds/27662 and https://lab.llvm.org/buildbot/#/builders/169/builds/21376/steps/11/logs/stdio for memory leak issues.
…on index (#188508) When a dynamic index of -1 (the kPoisonIndex sentinel) was folded into the static position of a vector.insert op, foldDenseElementsAttrDestInsertOp would proceed to call calculateInsertPosition, which returned -1. The subsequent iterator arithmetic (allValues.begin() + (-1)) was undefined behaviour, causing an assertion in DenseElementsAttr::get. Fix by bailing out early in foldDenseElementsAttrDestInsertOp when any static position equals kPoisonIndex, consistent with how InsertChainFullyInitialized already guards this case. Fixes #188404 Assisted-by: Claude Code
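The early bail-out described above can be sketched in Python. This is only an illustration of the guard, not the MLIR C++ code: `linear_position` and `fold_insert` are hypothetical helpers, and the flat list stands in for the dense element storage. Note that in Python a negative index would silently wrap around instead of being undefined behaviour, which makes the explicit check just as necessary.

```python
from typing import List, Optional, Sequence

K_POISON_INDEX = -1  # sentinel value named in the commit message


def linear_position(static_position: Sequence[int], shape: Sequence[int]) -> int:
    """Row-major linear offset of a full-rank position (hypothetical helper)."""
    offset = 0
    for index, dim in zip(static_position, shape):
        offset = offset * dim + index
    return offset


def fold_insert(
    values: Sequence[int],
    shape: Sequence[int],
    static_position: Sequence[int],
    new_value: int,
) -> Optional[List[int]]:
    """Return the folded element list, or None when folding must be skipped."""
    # Bail out before any offset arithmetic if a poison index is present.
    if any(index == K_POISON_INDEX for index in static_position):
        return None
    result = list(values)
    result[linear_position(static_position, shape)] = new_value
    return result
```

Returning `None` here plays the role of "do not fold"; the caller keeps the original operation untouched.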
…nt (#189163) When invoking `-test-bytecode-roundtrip=test-dialect-version=X.Y` on a module that contains no test dialect operations, the reader type callback in `runTest0` called `reader.getDialectVersion<test::TestDialect>()` and then immediately asserted that it succeeded. However, if the test dialect was never referenced in the bytecode (because no test dialect types appear in the module), the dialect's version information is not stored in the bytecode, so `getDialectVersion` legitimately returns failure. When the test dialect version is unavailable in the bytecode being read, the module contains no test dialect types, so no "funky"-group overrides are needed and the callback can safely skip by returning `success()`. A regression test is added with a module that has no test dialect ops, exercising the `test-dialect-version=2.0` path that previously crashed. Fixes #128321 Fixes #128325 Assisted-by: Claude Code
… (#188064)
This PR adds two new field specifiers (`operand` and `attribute`) and
extends the existing one (`result`):
- `default_factory` parameter is added for `result` and `attribute` to
specify default value via a lambda/function
- `kw_only` parameter is added for all these three specifiers, to make a
field a keyword-only parameter (without giving a default value).
```python
def result(
    *,
    infer_type: bool = False,
    default_factory: Optional[Callable[[], Any]] = None,
    kw_only: bool = False,
) -> Any: ...

def operand(
    *,
    kw_only: bool = False,
) -> Any: ...

def attribute(
    *,
    default_factory: Optional[Callable[[], Any]] = None,
    kw_only: bool = False,
) -> Any: ...
```
Examples of how to use them:
```python
class OperandSpecifierOp(TestFieldSpecifiers.Operation, name="operand_specifier"):
    a: Operand[IntegerType[32]] = operand()
    b: Optional[Operand[IntegerType[32]]] = None
    c: Operand[IntegerType[32]] = operand(kw_only=True)

class ResultSpecifierOp(TestFieldSpecifiers.Operation, name="result_specifier"):
    a: Result[IntegerType[32]] = result()
    b: Result[IntegerType[16]] = result(infer_type=True)
    c: Result[IntegerType] = result(
        default_factory=lambda: IntegerType.get_signless(8)
    )
    d: Sequence[Result[IntegerType]] = result(default_factory=list)
    e: Result[IntegerType[32]] = result(kw_only=True)

class AttributeSpecifierOp(
    TestFieldSpecifiers.Operation, name="attribute_specifier"
):
    a: IntegerAttr = attribute()
    b: IntegerAttr = attribute(
        default_factory=lambda: IntegerAttr.get(IntegerType.get_signless(32), 42)
    )
    c: StringAttr["a"] | StringAttr["b"] = attribute(
        default_factory=lambda: StringAttr.get("a")
    )
    d: IntegerAttr = attribute(kw_only=True)
```
---------
Co-authored-by: Rolf Morel <rolfmorel@gmail.com>
Summary: These were renamed and the aliases removed, fix running the tests.
Signed-off-by: Shikhar Soni <shikharish05@gmail.com>
…89128) This fixes #186684. Also fix (not) breaking variables declared on the same line as the closing brace, and adapt Whitesmiths to those changes.
…efs (#188860) Fixes #188695
…ng and tests (#184365) Closes #181654
… broadcast from sg-to-wi (#185960) This PR adds distribution patterns for vector.step, vector.shape_cast & vector.broadcast in the new sg-to-wi pass
…. (#188721) If a load and a store have different address spaces, we cannot create a runtime check. Instead, always copy the data to an alloca matching the store address space. Fixes llvm/llvm-project#185236. PR: llvm/llvm-project#188721
Need to check if the potential bitcast/bswap-like construct is a root of the reduction, otherwise it cannot represent a bitcast/bswap construct. Fixes #189184
Add VPlan printing test for llvm/llvm-project#186252 llvm/llvm-project#189022
FormatTest.cpp is too large; extract some tests to mitigate this a bit.
#184545 default-enables the IO sandbox in assert-builds. This causes Clang using Polly to crash (#188568). The issue is that `PassBuilder` uses `vfs::getRealFileSystem()` by default, which is considered an IO sandbox violation in the Clang process. This PR stores the VFS of the `PassBuilder` from the original `registerPollyPasses` call and uses it when creating other `PassBuilder` instances. This PR also adds infrastructure for running Polly in `clang` (in addition to `opt`). `opt` does not enable the sandbox, so we need separate tests using Clang. Closes: #188568
On musl, rlimit64 is an alias for rlimit rather than a distinct type as in glibc. Add a SANITIZER_MUSL elif branch so that struct_rlimit64_sz is defined for musl-based Linux targets.
…189199) Use start + (end - start) / 2 instead of (start + end) / 2 to compute the midpoint address. The original expression overflows when start + end exceeds UPTR_MAX, which happens on 32-bit targets whose memory layout includes regions above 0x80000000.
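The overflow can be reproduced with a small Python model of 32-bit unsigned arithmetic. This is a sketch under the assumption that `uptr` is 32 bits wide; the helper names are hypothetical and the masking only emulates the wraparound that C++ unsigned integers perform.

```python
UPTR_MAX = 0xFFFFFFFF  # assumes a 32-bit uptr


def wrap(value: int) -> int:
    """Emulate unsigned 32-bit wraparound."""
    return value & UPTR_MAX


def midpoint_naive(start: int, end: int) -> int:
    # start + end may exceed UPTR_MAX and wrap before the division.
    return wrap(start + end) // 2


def midpoint_safe(start: int, end: int) -> int:
    # end - start never overflows when start <= end.
    return wrap(start + wrap(end - start) // 2)


start, end = 0x80000000, 0xC0000000
naive = midpoint_naive(start, end)
safe = midpoint_safe(start, end)
```

For this range the naive form yields `0x20000000`, which lies outside `[start, end]`, while the safe form yields `0xA0000000`.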
As a new contributor, it helps to correctly see the right maintainer.
…C (#189214) ICF's InputSection::replace() calls markDead() on folded sections, so `!sec->isLive()` already filters them.
Reverted in 54aa435. The missing symbols are implemented in libdevice, e.g. llvm/libdevice/nativecpu_utils.cpp (line 46 in c62d1d4). I have skipped native_cpu check-libclc in d93a810. This aligns with the sycl branch.
LGTM. I just added a minor code formatting change in 6e622e5 to align with https://github.com/intel-restricted/applications.compilers.llvm-project/blob/ef83a191161833ae6a631d2a64630a88003e7ac0/libclc/CMakeLists.txt#L597-L601
@intel/dpcpp-nativecpu-reviewers, there is a gap in testing native_cpu. Please improve testing to cover libclc.
XFAIL first to unblock pulldown.
The code was accidentally removed in intel-restricted/applications.compilers.llvm-project@4d19936#diff-0bcd40d8040a4c485a91c581d1f60f04570fa7d3fc168ff4999db520a33e82af . Fixes: Clang::Index/index-builtin-opencl.clcpp https://jira.devtools.intel.com/browse/CMPLRLLVM-74121
…682bd (#35866) Fix unit test failure in StencilTest.DescribeImplicitOperator caused by commit f4682bd which introduced staged lambda initialization. In SYCL/CUDA/OpenMP builds, getManglingNumber() is called between setLambdaContextDecl() and setLambdaNumbering(), which can trigger linkage computation for init-capture variables while the lambda is only partially initialized. This causes the cached linkage to differ from the linkage computed after full initialization, triggering the assertion in DeduceVariableDeclarationType. Solution: Invalidate cached linkage after deducing the type. This ensures linkage is recomputed with the complete type. Fixes regressions due to: * 6df388f 2026-03-13 [sycl-web] Reland 8ce2b9c zahira.ammarguellat@intel.com * f4682bd 2026-03-13 [Clang][ItaniumMangle] Fix recursive mangling for lambda init-captures (#182667) ototot@google.com --------- Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Fix the sycl-oneapi-gpu-amdgpu.cpp test that broke after commit 2757b58 which merged changes from 'main' to 'sycl-web'. The merge introduced ToolChain::normalizeOffloadTriple() which automatically completes incomplete triples (e.g., "amdgcn" becomes "amdgcn-amd-amdhsa"). However, the SYCL driver should reject incomplete triples when directly specified by the user. Updated Driver.cpp:1389 to avoid normalizing user-provided triples for SYCL. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fix SYCLNativeCPUUtils vecz transform to handle the upstream change in commit 3604119 that disallows calling getTerminator() on blocks without terminators.
The rewireDivergentLoopExitBlocks function creates new basic blocks (newDivergentLE) without terminators, but later operations like DT->recalculate(), isReachable(), and predecessors() traverse the CFG and call getTerminator(), which now asserts on blocks without terminators.
Applied the same workaround pattern used in commit 873322287a4312a7 (VPO/Paropt fix): temporarily add UnreachableInst terminators to newly created blocks before CFG traversal operations, then remove and replace them with proper terminators in computeNewTargets.
Changes:
- Add temporary UnreachableInst to newDivergentLE after creation
- Use getTerminatorOrNull() instead of getTerminator() where blocks may temporarily lack terminators
- Skip blocks without terminators in isReachable() CFG traversal
- Handle placeholder UnreachableInst terminators in computeNewTargets
Fixes: SYCL :: check_device_code/native_cpu/vectorization.cpp
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…lectives The unsigned variants (uchar, ushort, uint, ulong) of __clc__get_group_scratch functions were declared in collectives.cl but never defined in collectives_helpers.cl for both PTX and AMD targets. This caused linker errors during device code compilation when group broadcast operations used unsigned types. Add implementations that cast from the corresponding signed type's scratch memory. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…160956)" This reverts commit cc38e42.
This is ready for review.
@intel/dpcpp-nativecpu-reviewers
@intel/llvm-gatekeepers This is ready for merge. Please help to issue /merge. Thanks.
/merge
Fri 17 Apr 2026 01:10:56 PM UTC --- Start to merge the commit into sycl branch. It will take several minutes.
Fri 17 Apr 2026 01:24:38 PM UTC --- Merge the branch in this PR to base automatically. Will close the PR later.
LLVM: llvm/llvm-project@7a3b7f1
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@b241000